Hybrid hierarchical clustering: cluster assessment via cluster validation indices
Abstract
This paper introduces hybrid hierarchical clustering, a novel method that speeds up agglomerative hierarchical clustering by seeding the algorithm with clusters obtained from K-means clustering. It reports a benchmark study comparing the performance of hybrid hierarchical clustering with that of conventional hierarchical clustering. The two methods are compared on 16 benchmark data sets using the cluster validation index signature, an aggregation of several cluster validation indices. In most cases, the signatures indicate similar clusterings for unseeded and seeded hierarchical clustering.
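The seeding idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the linkage criterion, the K-means initialization, or the number of seed clusters, so the centroid linkage, the farthest-point initialization, and the names `kmeans`, `hybrid_cluster`, `k_seed`, and `k_final` are all assumptions made for illustration.

```python
import math


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def kmeans(points, k, iters=20):
    """Lloyd's K-means; returns one label per point.

    Assumption: deterministic farthest-point initialization, chosen here
    only so the sketch is reproducible.
    """
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = mean(members)
    return labels


def hybrid_cluster(points, k_seed, k_final):
    """Seed with K-means (k_seed clusters), then merge down to k_final.

    Assumption: centroid linkage on the seed clusters; the paper may use
    a different linkage.
    """
    seed_labels = kmeans(points, k_seed)
    groups = {}
    for p, lab in zip(points, seed_labels):
        groups.setdefault(lab, []).append(p)
    # Each node: (centroid, number of points, set of seed-cluster ids).
    nodes = [(mean(m), len(m), {lab}) for lab, m in groups.items()]
    while len(nodes) > k_final:
        # Find the pair of nodes with the closest centroids.
        a, b = min(
            ((i, j) for i in range(len(nodes)) for j in range(i + 1, len(nodes))),
            key=lambda ij: math.dist(nodes[ij[0]][0], nodes[ij[1]][0]),
        )
        (ca, na, sa), (cb, nb, sb) = nodes[a], nodes[b]
        centroid = tuple((na * x + nb * y) / (na + nb) for x, y in zip(ca, cb))
        merged = (centroid, na + nb, sa | sb)
        nodes = [n for i, n in enumerate(nodes) if i not in (a, b)] + [merged]
    # Map every seed cluster to the final cluster that absorbed it.
    seed_to_final = {s: f for f, (_, _, seeds) in enumerate(nodes) for s in seeds}
    return [seed_to_final[lab] for lab in seed_labels]
```

Calling `hybrid_cluster(points, k_seed=9, k_final=3)` on a list of coordinate tuples returns one final cluster label per point. The agglomerative stage starts from at most `k_seed` nodes instead of one node per data point, which is where the speed-up described in the abstract would come from.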
Similar resources
Canonical PSO Based K-Means Clustering Approach for Real Datasets
Clustering is a technique whose significance and applications span many fields. It is an unsupervised process in data mining, so properly evaluating the results and measuring the compactness and separability of the clusters are important issues. The procedure for evaluating the results of a clustering algorithm is known as a cluster validity measure. Different ...
Hybrid Hierarchical Clustering: an Experimental Analysis
In this paper, we present a hybrid clustering method that combines divisive hierarchical clustering with agglomerative hierarchical clustering. Our method uses the bisect K-means divisive clustering algorithm. First, we cluster the document collection using the bisect K-means clustering algorithm with K’ > K as the total number of clusters. Second, we calculate the centroids of K’ clu...
Determining the most proper number of cluster in fuzzy clustering by using artificial neural networks
In a clustering problem, fuzzy clustering is preferable when there is uncertainty in determining the clusters or the memberships of some units. Determining the number of clusters plays an important role in obtaining sensible and sound results in cluster analysis. Many clustering algorithms first need to know the number of clusters. However, there is no pre-information about the numb...
External Validation Measures for Nested Clustering of Text Documents
This article addresses the problem of validating the results of nested (as opposed to "flat") clusterings. It shows that standard external validation indices used for validating partitioning clusterings, such as the Rand statistic, the Hubert Γ statistic, or the F-measure, are not applicable to nested clusterings. Additionally to the work, where F-measure was adopted to hierarchical classification as hF-meas...
Performance Validation of the Modified K-Means Clustering Algorithm Clusters Data
In this paper, we present an analysis of the Modified K-Means clustering algorithm and its performance. The clustering analysis partitions the data into a chosen number of clusters and checks whether each cluster is properly formed, using the silhouette coefficient method. Here the silhouette coefficient is applied to the group of author’s Hand G-indice...